IRIT: Textual Similarity Combining Conceptual Similarity with an N-Gram Comparison Method
Abstract
This paper describes the participation of the IRIT team in SemEval 2012 Task 6 (Semantic Textual Similarity). The method combines an n-gram-based comparison method with a conceptual similarity measure that uses WordNet to calculate the similarity between a pair of concepts.
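As a rough illustration of how an n-gram comparison can be combined with a concept-level similarity, here is a minimal sketch. It is not the IRIT system: the character trigrams, the Dice coefficient, the best-match averaging, the linear weighting, and all function names are assumptions made for this example, and the tiny hand-written synonym table is only a stand-in for a WordNet-based concept similarity.

```python
from typing import Callable, Set

def char_ngrams(text: str, n: int = 3) -> Set[str]:
    """Return the set of character n-grams of a whitespace-normalized, lowercased string."""
    s = " ".join(text.lower().split())
    return {s[i:i + n] for i in range(len(s) - n + 1)}

def ngram_similarity(a: str, b: str, n: int = 3) -> float:
    """Dice coefficient over character n-gram sets (an assumed choice of measure)."""
    ga, gb = char_ngrams(a, n), char_ngrams(b, n)
    if not ga or not gb:
        return 0.0
    return 2 * len(ga & gb) / (len(ga) + len(gb))

# Hypothetical stand-in for a WordNet-based measure: a tiny synonym table.
TOY_SYNONYMS = {frozenset({"car", "automobile"}): 0.9}

def toy_word_sim(x: str, y: str) -> float:
    """Return 1.0 for identical words, a table score for listed pairs, else 0.0."""
    if x == y:
        return 1.0
    return TOY_SYNONYMS.get(frozenset({x, y}), 0.0)

def concept_similarity(a: str, b: str,
                       word_sim: Callable[[str, str], float]) -> float:
    """Average, in both directions, of each word's best match in the other sentence."""
    wa, wb = a.lower().split(), b.lower().split()
    if not wa or not wb:
        return 0.0

    def directed(xs, ys):
        return sum(max(word_sim(x, y) for y in ys) for x in xs) / len(xs)

    return (directed(wa, wb) + directed(wb, wa)) / 2

def combined_similarity(a: str, b: str,
                        word_sim: Callable[[str, str], float],
                        weight: float = 0.5) -> float:
    """Linear mix of the two scores; the weight is an illustrative assumption."""
    return (weight * ngram_similarity(a, b)
            + (1 - weight) * concept_similarity(a, b, word_sim))

if __name__ == "__main__":
    print(combined_similarity("the car is red", "the automobile is red", toy_word_sim))
```

Passing the word-level measure in as a function keeps the sketch runnable without external data; a WordNet-backed function could be substituted for `toy_word_sim` without changing the rest.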
Similar resources
FBK-HLT: A New Framework for Semantic Textual Similarity
This paper reports the description and performance of our system, FBK-HLT, participating in the SemEval 2015, Task #2 “Semantic Textual Similarity”, English subtask. We submitted three runs with different hypotheses in combining typical features (lexical similarity, string similarity, word n-grams, etc.) with syntactic structure features, resulting in different sets of features. The results eval...
An Effective Sentence Ordering Approach For Multi-Document Summarization Using Text Entailment
With the rapid development of modern technology, the amount of electronically available textual information has increased considerably. Manually summarizing textual information from unstructured text sources creates overhead for the user, so a systematic approach is required. Summarization is an approach that focuses on providing the user with a condensed version of the original text bu...
LIPN-CORE: Semantic Text Similarity using n-grams, WordNet, Syntactic Analysis, ESA and Information Retrieval based Features
This paper describes the system used by the LIPN team in the Semantic Textual Similarity task at SemEval 2013. It uses a support vector regression model, combining different text similarity measures that constitute the features. These measures include simple distances like Levenshtein edit distance, cosine, Named Entities overlap and more complex distances like Explicit Semantic Analysis, WordN...
Evaluation and Comparison of Concept Based and N-Grams Based Text Clustering Using SOM
With the great and rapidly growing number of documents available in digital form (Internet, library, CD-Rom...), the automatic classification of texts has become a significant research field and a fundamental task in document processing. This paper deals with unsupervised classification of textual documents also called text clustering using Self-Organizing Maps of Kohonen in two new situations:...
Syntactic Similarity
This paper describes a two-way Textual Entailment (TE) recognition system that uses lexical and syntactic features. The TE system is rule-based and relies on lexical and syntactic similarities. The important lexical similarity features used in the present system are: WordNet-based unigram match, bigram match, longest common subsequence, skip-gram, and stemming. In the syntactic ...